(54) Title: POWER SIGNATURE ATTACK RESISTANT CRYPTOGRAPHY
(57) Abstract

This invention provides a method of computing a multiple k of a point P on an elliptic
curve defined over a field, the method including the steps of representing the number k as
a binary vector $k_i$, forming an ordered pair of points $P_1$ and $P_2$, wherein the points
$P_1$ and $P_2$ differ at most by P, and selecting each of the bits $k_i$ in sequence, and for
each of the $k_i$: upon $k_i$ being a 0, computing a new set of points $P_1'$, $P_2'$ by doubling
the first point $P_1$ to generate the point $P_1'$ and adding the points $P_1$ and $P_2$ to
generate the point $P_2'$; or upon $k_i$ being a 1, computing a new set of points $P_1'$, $P_2'$
by doubling the second point $P_2$ to generate the point $P_2'$ and adding the points $P_1$ and
$P_2$ to produce the point $P_1'$, whereby the doubles or adds are always performed in the
same order for each of the bits $b_i$, thereby minimizing a timing attack on the method. An
embodiment of the invention applies to both multiplicative and additive groups.
POWER SIGNATURE ATTACK RESISTANT CRYPTOGRAPHY



This invention relates to a method and apparatus for minimizing power signature 
attacks in cryptographic systems. 

BACKGROUND OF THE INVENTION 

Cryptographic systems generally owe their security to the fact that a particular piece
of information is kept secret, without which it is almost impossible to break the scheme. The
secret information must generally be stored within a secure boundary in the cryptographic
processor, making it difficult for an attacker to get at it directly. However, various schemes
or attacks have been attempted in order to obtain this secret information. One of these is the
timing or power signature attack.

The timing attack (or "side channel attack") is an obvious result of sequential
computational operations performed during cryptographic operations. The attack usually
exploits some implementation aspect of a cryptographic algorithm.

For example, current public key cryptographic schemes such as RSA and elliptic curve
(EC) operate over mathematical groups: $Z_n^*$ ($n = pq$) in RSA, discrete log systems in a
finite field $F_q^*$ ($q$ is a power of a prime) or $F_{2^m}^*$, or an EC group over these finite
fields. The group operations, called multiplication modulo $n$ in RSA and addition of points
in EC, are sequentially repeated in a particular way to perform a scalar operation. In RSA
the operand is called an exponent, the operation is called exponentiation and the method of
multiplying is commonly known as repeated square-and-multiply. Thus given a number
$a \in Z_n^*$ and an integer $0 \le k < p$, the exponent, whose binary representation is
$k = \sum_i k_i 2^i$, a value $a^k \bmod n$ may be calculated by repeated use of the
"square-and-multiply" algorithm (described in Handbook of Applied Cryptography, p. 615).
Similarly, given $g(x) \in F_{p^m}$ and an integer $0 < k < p^m - 1$, then $g(x)^k \bmod f(x)$
may be calculated by this method.
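
As an illustration of the square-and-multiply method referred to above, a minimal Python
sketch follows. The function name `square_and_multiply` and the left-to-right bit order are
assumptions made for this example and are not taken from the cited handbook. The point to
notice is that a multiplication is performed only for the 1 bits of the exponent.

```python
def square_and_multiply(a, k, n):
    """Left-to-right binary exponentiation: returns a**k mod n."""
    result = 1
    for bit in bin(k)[2:]:                  # most significant bit first
        result = (result * result) % n      # a squaring for every bit
        if bit == "1":
            result = (result * a) % n       # a multiplication only for 1 bits
    return result
```

For instance, `square_and_multiply(7, 45, 143)` agrees with Python's built-in
`pow(7, 45, 143)`.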

On the other hand, in EC the operand is a scalar multiplier, the operation is called
scalar multiplication of a point, and the method is known as "double-and-add". Thus if k is a
positive integer and P is an elliptic curve point then kP may be obtained by the
"double-and-add" method. Both these methods are well known in the art and will not be
discussed further.
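
For readers who want a concrete picture of "double-and-add", the sketch below runs it on a
toy curve. The curve $y^2 = x^3 + 2x + 2$ over $F_{17}$, the base point (5, 1) and all function
names are assumptions chosen for illustration; the parameters are far too small for any real
use, and affine coordinates are used only for readability.

```python
P_MOD, A = 17, 2      # toy curve y^2 = x^3 + 2x + 2 over F_17 (illustrative only)
INF = None            # the point at infinity
G = (5, 1)            # a point on the toy curve

def ec_add(p, q):
    """Affine group law; handles infinity, inverses and doubling."""
    if p is INF:
        return q
    if q is INF:
        return p
    x1, y1 = p
    x2, y2 = q
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return INF                                        # q is the inverse of p
    if p == q:
        lam = (3 * x1 * x1 + A) * pow(2 * y1, -1, P_MOD)  # tangent slope
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD)  # chord slope
    lam %= P_MOD
    x3 = (lam * lam - x1 - x2) % P_MOD
    y3 = (lam * (x1 - x3) - y1) % P_MOD
    return (x3, y3)

def double_and_add(k, p):
    """Left-to-right double-and-add: returns kP."""
    result = INF
    for bit in bin(k)[2:]:
        result = ec_add(result, result)     # a double for every bit
        if bit == "1":
            result = ec_add(result, p)      # an add only for 1 bits
    return result
```

On this toy curve `double_and_add(2, G)` gives `(6, 3)`. The modular inverse via
`pow(x, -1, m)` requires Python 3.8 or later.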
As mentioned earlier, an attacker once in possession of the private key (either long
term or session) is able to forge signatures and decrypt secret messages for the attacked
entity. Thus it is paramount to maintain the secrecy or integrity of the private key in the
system.

Many techniques have been suggested to obtain the private key. The encryption
operations are performed either in a special purpose or general-purpose processor operating
in a sequential manner. Recent attack methods have been proposed in open literature, as for
example described in Paul Kocher's article "Timing attacks on implementations of
Diffie-Hellman, RSA, DSS and other systems". These attacks have been based on timing
analysis of these processors or, in other words, timing analysis of "black box" operations. In
one instance an attacker, by capturing the instantaneous power usage of a processor
throughout a private key operation, obtains a power signature. The power signature relates to
the number of gates operating at each clock cycle. Each fundamental operation as described in
the preceding paragraph generates a distinct timing pattern. Other methods exist for obtaining
a power signature than instantaneous power usage.

Laborious but careful analysis of an end-to-end waveform can decompose the order of
double-and-add or square-and-multiply operations. Using the standard algorithm, either a
double or a square must occur for each bit of either the scalar multiplier or the exponent,
respectively. Therefore, the places where double waveforms are adjacent to each other
represent bit positions with zeros and places where there are add waveforms indicate bits with
ones. Thus, these timing measurements can be analyzed to find the entire secret key and thus
compromise the system.
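
The decomposition described above can be mimicked in a few lines. In this sketch the power
waveform is replaced by an idealised string of letters, 'D' for a double and 'A' for an add;
that abstraction, and the function names, are assumptions for illustration and do not model
any real measurement.

```python
def double_and_add_trace(k):
    """Idealised operation sequence of left-to-right double-and-add."""
    trace = []
    for bit in bin(k)[2:]:
        trace.append("D")            # every bit causes a double
        if bit == "1":
            trace.append("A")        # a 1 bit causes an additional add
    return "".join(trace)

def recover_scalar(trace):
    """Rebuild k from the D/A sequence alone."""
    bits = []
    for op in trace:
        if op == "D":
            bits.append("0")         # a double opens a new bit position
        else:
            bits[-1] = "1"           # an add marks the current bit as a 1
    return int("".join(bits), 2)
```

For example, the scalar 11 (binary 1011) yields the trace `DADDADA`, from which
`recover_scalar` returns 11 again.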

In addition to the "square and multiply" or "double and add" techniques mentioned
earlier, other methods to compute kP are, for example, the "binary ladder" or Montgomery
method described in "Speeding the Pollard and Elliptic Curve Methods of Factorization" by
Peter L. Montgomery. In this method the x-coordinates of the pair of points (iP, (i+1)P) are
computed. The Montgomery method is an efficient algorithm for performing modular
multiplication, more clearly illustrated by an example. Given a group $E(F_p)$ and given a
point P on the elliptic curve, the Montgomery method may be used to compute another point
kP. Given an ordered pair of points (iP, (i+1)P), then for each of the bits of the binary
representation of k, if bit i is a 0 then the next set of points computed is (2iP, (2i+1)P) and
if bit i is a 1, then the next set of points is ((2i+1)P, (2i+2)P), that is, the first of the pair is
derived from a doubling or an adding depending on whether the bit is a 0 or 1.
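
The update rule for the pair can be checked on the multipliers alone, without any curve
arithmetic. The sketch below tracks the integers (i, i+1) instead of the points (iP, (i+1)P);
starting from (0, 1) and reading the bits of k from the most significant end are assumptions
of this example.

```python
def ladder_indices(k):
    """Track the pair (i, i+1) for which the ladder holds (iP, (i+1)P)."""
    i, j = 0, 1                       # start from (0P, 1P)
    history = [(i, j)]
    for bit in bin(k)[2:]:
        if bit == "0":
            i, j = 2 * i, i + j       # (2i, 2i+1)
        else:
            i, j = i + j, 2 * j       # (2i+1, 2i+2)
        history.append((i, j))
    return history
```

For k = 5 (binary 101) the pairs are (0, 1), (1, 2), (2, 3), (5, 6): the first entry of the
final pair is k, and the two entries always differ by exactly 1.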
In a processor, each of the doubles and adds involves multiple operations which
generate unique power signatures. By observing these power signatures, as shown
schematically in figure 1(a), the attacker may derive a sequence of 0s and 1s and thus the
scalar or exponent being used.

The Montgomery method is preferable in EC cryptographic systems because of its
extreme efficiency over the straight "double and add" described earlier.

The attack on the Montgomery method as described above is particularly important if
performing RSA private key operations. In a recent paper published by Dan Boneh et al.
entitled "An Attack On RSA Given A Small Fraction Of The Private Key Bits", it has been
shown that for RSA with a low public exponent, given a quarter of the bits of the private key,
an adversary can determine the entire private key. With this attack combined with the power
signature attack described above, the RSA scheme is extremely vulnerable.

Thus, it is an object of this invention to provide a system which minimizes the risk of
a successful timing attack, particularly when utilizing the Montgomery method on private key
operations.

SUMMARY OF THE INVENTION 

In accordance with this invention, there is provided a method of computing a multiple
k of a point P on an elliptic curve defined over a field, said method comprising the steps of:

a) representing the number k as a binary vector of bits $k_i$;

b) forming an ordered pair of points $P_1$ and $P_2$, wherein the points $P_1$ and $P_2$ differ at
most by P; and

c) selecting each of said bits $k_i$ in sequence; and for each of said $k_i$:

i) upon $k_i$ being a 0,

ii) computing a new set of points $P_1'$, $P_2'$ by doubling the first
point $P_1$ to generate said point $P_1'$; and

iii) adding the points $P_1$ and $P_2$ to generate the point $P_2'$;

or upon $k_i$ being a 1,

iv) computing a new set of points $P_1'$, $P_2'$ by doubling the second point $P_2$
to generate the point $P_2'$; and

v) adding the points $P_1$ and $P_2$ to produce the point $P_1'$;

whereby said doubles or adds are always performed in the same order for each of said bits $b_i$,
thereby minimizing a timing attack on said method.
In accordance with a further aspect of this invention, the field is either $F_{2^m}$ or $F_p$.

In accordance with a further aspect of this invention, there is provided a processor
hardware for implementing the method.

BRIEF DESCRIPTION OF THE DRAWINGS

These and other features of the preferred embodiments of the invention will become 
more apparent in the following detailed description in which reference is made to the 
appended drawings wherein: 

Figures 1(a) and 1(b) are schematic representations of a processor power usage
signature;

Figure 2 is a flow diagram of a method according to an embodiment of the present 
invention; 

Figure 3 is a schematic diagram of a symmetric processor implementing a method
according to an embodiment of the present invention; and

Figure 4 is a schematic representation of an integer k in binary.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS 

Referring to figure 2, a generalized algorithm for computing a multiple of a point on
an elliptic curve defined over a field $F_{2^m}$ or $F_p$ is indicated generally by numeral 20. In
this embodiment, the point P is a parameter of the system. The algorithm computes a multiple
kP of the point P, wherein the scalar k is possibly a private key or other secret value. The
scalar k is represented in a register as a binary vector having bits $b_i$ 24. A pair of elements
(a, b) is created, where a and b are points on an elliptic curve which differ at most by P or, in
the case of the group $F_p$, a and b are elements which differ by a multiple of g.

In the present embodiment, we will consider an elliptic curve scheme; thus, the
elements a and b correspond to the x-coordinates of an ordered pair of points iP and (i+1)P.
An improved Montgomery method for deriving and utilising the x-coordinates of elliptic
curve points is described in the applicant's pending US patent application serial No.
09/047,518, incorporated herein by reference. A bit $b_i$, beginning with the first bit of the
binary representation of the scalar k, is evaluated. Depending on the value of the bit, one of
two algorithms 26 or 28 is chosen. If the bit is a 0, shown at block 25a, the first element a of
the input pair (a, b) is doubled and stored in the first element a' of the output pair (a', b'),
while the first and second elements of the input are added, i.e., a + b, and placed in the second
element b' of the output pair (a', b'). If the bit is a 1, shown at block 25b, the second element b
of the input pair (a, b) is doubled and stored in the second element b' of the output pair (a', b'),
while the first and second input elements are added, i.e., a + b, and placed in the first element
a' of the output pair (a', b'). These steps are repeated for all bits of the scalar k.
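
The steps just described can be sketched as follows. This is an illustration under stated
assumptions, not the applicant's implementation: it uses full affine points rather than
x-coordinates only, a toy curve $y^2 = x^3 + 2x + 2$ over $F_{17}$ with base point (5, 1), and
hypothetical names such as `ladder`. Python integer arithmetic is not constant-time, so only
the order of operations is being shown.

```python
P_MOD, A = 17, 2      # toy curve y^2 = x^3 + 2x + 2 over F_17 (illustrative only)
INF = None            # the point at infinity
G = (5, 1)            # a point on the toy curve

def ec_add(p, q):
    """Affine group law; handles infinity, inverses and doubling."""
    if p is INF:
        return q
    if q is INF:
        return p
    x1, y1 = p
    x2, y2 = q
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return INF
    if p == q:
        lam = (3 * x1 * x1 + A) * pow(2 * y1, -1, P_MOD)
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD)
    lam %= P_MOD
    x3 = (lam * lam - x1 - x2) % P_MOD
    return (x3, (lam * (x1 - x3) - y1) % P_MOD)

def ladder(k, p):
    """Return kP using one double followed by one add for every bit of k."""
    a, b = INF, p                      # (0P, 1P): the pair differs by P
    for bit in bin(k)[2:]:
        if bit == "0":
            doubled = ec_add(a, a)     # 2a
            summed = ec_add(a, b)      # a + b
            a, b = doubled, summed     # (a', b') = (2a, a + b)
        else:
            doubled = ec_add(b, b)     # 2b
            summed = ec_add(a, b)      # a + b
            a, b = summed, doubled     # (a', b') = (a + b, 2b)
    return a
```

In both branches the double is computed before the add, so the sequence of group operations
does not depend on the bit value. On the toy curve, `ladder(2, G)` gives `(6, 3)`.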
It may be seen thus, from figure 1(b), that performing the "double" operation
followed by the "add" operation for each of the bits produces a consistent power signature
waveform, thus providing little information to a potential attacker. The operations could also
be performed in reverse order, i.e., first performing the "add" then the "double" operation. In
an RSA scheme, the analogous operations are "square and multiply".
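
For the RSA analogue, a minimal sketch of the same ladder in a multiplicative group is given
below. The function name `ladder_pow` and the starting pair $(g^0, g^1)$ are assumptions for
illustration; as before, the sketch shows the fixed square-then-multiply order only and makes
no claim about the timing behaviour of Python's integer arithmetic.

```python
def ladder_pow(g, k, n):
    """Return g**k mod n with one squaring then one multiplication per bit."""
    a, b = 1 % n, g % n                # (g^0, g^1): the pair differs by a factor g
    for bit in bin(k)[2:]:
        if bit == "0":
            squared = (a * a) % n      # g^(2i)
            product = (a * b) % n      # g^(2i+1)
            a, b = squared, product
        else:
            squared = (b * b) % n      # g^(2i+2)
            product = (a * b) % n      # g^(2i+1)
            a, b = product, squared
    return a
```

The result can be compared against the built-in `pow(g, k, n)` for any small inputs.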

More clearly, suppose we are computing kP using the "binary ladder" method; then
after some iterations we have the x-coordinates of (iP, (i+1)P), i.e. having processed i bits of k
as shown schematically in figure 4. If the next bit to be processed is 0, then we must
construct the (ordered pair of) x-coordinates (2iP, (2i+1)P). If the next bit is 1, then we
must produce the (ordered pair of) x-coordinates ((2i+1)P, (2i+2)P).

It is likely that the "double" formula requires roughly the same amount of power (and
time) regardless of the input. It is likely that add formulas require roughly the same amount
of power (and time) regardless of the input. However, an execution of the double formula
will require a different amount (less, if the usual Montgomery formulas are used) of power
than an execution of the add formula.

Hence, by monitoring the power bar, we can distinguish between a "double" and an
"add". Thus, if these equations are executed in a consistent order, then the power signatures
of a 1 being processed or a 0 being processed are indistinguishable. Each consists of a
"double" power signature, followed by an "add" power signature.

We mention that if the order of evaluation is reversed in both cases, then the power
signatures are still indistinguishable.
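
The indistinguishability argument can be illustrated with an idealised model in which each
group operation is recorded as a letter ('D' for double, 'A' for add) instead of a measured
waveform. The model and the function names are assumptions for this sketch; real power
traces carry more information than the order of operations.

```python
def double_and_add_trace(k):
    """Idealised trace of plain double-and-add: the adds reveal the 1 bits."""
    trace = []
    for bit in bin(k)[2:]:
        trace.append("D")
        if bit == "1":
            trace.append("A")
    return "".join(trace)

def ladder_trace(k):
    """Idealised trace of the ladder: 'D' then 'A' for every bit, 0 or 1."""
    trace = []
    for _ in bin(k)[2:]:
        trace.append("D")    # double a (bit 0) or b (bit 1)
        trace.append("A")    # a + b in both cases
    return "".join(trace)
```

Under this model all eight 4-bit scalars from 8 to 15 produce the same ladder trace,
`DADADADA`, whereas plain double-and-add produces eight different traces.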

Hence, this method for computing kP on an elliptic curve is preferred since it avoids 
revealing the integer k through power consumption statistics. When the "Montgomery" 
"double" and "add" formulas are used, this method is also efficient, especially when the 
projective form is used which avoids inversions. 

In the context of efficiencies, it is noted that at each step of the "binary ladder"
method, two independent operations must be performed. That is, the results of the "add"
formula are not needed for the "double" formula and vice versa. This allows for an efficient
parallel hardware implementation.
Thus, referring to figure 3, a schematic parallel hardware implementation of the
present method is shown by numeral 30. In this implementation, a first and a second special
purpose processor are provided. The first processor 32 performs either a "double" or a
"square" operation, or both, while the second processor 34 performs an "add" or a "multiply"
operation, or both. A main processor 36 determines which of the special purpose processors
32 and 34 is activated.

The processors 32 and 34 are driven simultaneously. (The circuits may take different
times to execute, however.) The inputs and outputs of these circuits are dealt with in
accordance with the case we are in, i.e., with bit $b_i = 0$ or with bit $b_i = 1$. This simple
instance gives a speed-up of almost a factor of 2 over a serial implementation. Note that, at
least in the case of the traditional projective Montgomery formulae, the add circuit takes
longer and is more complicated than the double circuit. Since there is no need to have the
double circuit finish sooner than the add circuit, it can be slower. In practice, this might
mean that the double circuit can be built more cheaply.
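
The "almost a factor of 2" figure can be made concrete with a simple cost model. The model
(a fixed time per double and per add, with the per-bit time being their sum when serial and
their maximum when parallel) and the numbers used below are hypothetical and serve only to
illustrate the reasoning.

```python
def ladder_cost(bits, t_double, t_add, parallel):
    """Total time for `bits` ladder steps under a simple per-operation cost model."""
    if parallel:
        per_bit = max(t_double, t_add)   # both circuits run at once; wait for the slower
    else:
        per_bit = t_double + t_add       # one circuit runs after the other
    return bits * per_bit

serial = ladder_cost(160, 5, 6, parallel=False)     # hypothetical time units
concurrent = ladder_cost(160, 5, 6, parallel=True)
speed_up = serial / concurrent
```

With these assumed values the serial cost is 1760 units, the parallel cost is 960 units and the
speed-up is about 1.83. Slowing the double circuit from 5 to 6 units would leave the parallel
cost unchanged, which is the observation behind building the double circuit more cheaply.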

Although the invention has been described with reference to certain specific
embodiments, various modifications thereof will be apparent to those skilled in the art
without departing from the spirit and scope of the invention as outlined in the claims
appended hereto.
THE EMBODIMENTS OF THE INVENTION IN WHICH AN EXCLUSIVE
PROPERTY OR PRIVILEGE IS CLAIMED ARE DEFINED AS FOLLOWS:

1. A method of computing a multiple k of a point P on an elliptic curve defined over a
field, said method comprising the steps of:

a) representing the number k as a binary vector $k_i$;

b) forming an ordered pair of points $P_1$ and $P_2$, wherein the points $P_1$ and $P_2$ differ at
most by P; and

c) selecting each of said bits $k_i$ in sequence; and for each of said $k_i$:

i) upon $k_i$ being a 0, computing a new set of points $P_1'$, $P_2'$ by doubling the
first point $P_1$ to generate said point $P_1'$; and adding the points $P_1$ and $P_2$ to
generate the point $P_2'$;

or

ii) upon $k_i$ being a 1, computing a new set of points $P_1'$, $P_2'$ by doubling the
second point $P_2$ to generate the point $P_2'$; and adding the points $P_1$ and $P_2$ to
produce the point $P_1'$;

whereby said doubles or adds are always performed in the same order for each of said bits $b_i$,
thereby minimizing a timing attack on said method.

2. A method as defined in claim 1, said field being defined over $F_{2^m}$.

3. A method as defined in claim 1, said field being defined over $F_p$.
Figure 1.

Figure 2. [Flow diagram 20. Register 24 holds k = ($b_1$, $b_2$, $b_3$, ...). Start with i = 1
and (iP, (i+1)P) = (a, b). Block 25a, algorithm 26: compute 2a and a + b, giving
(a', b') = (2a, a + b). Block 25b, algorithm 28, for $b_i = 1$: compute 2b and a + b, giving
(a', b') = (a + b, 2b). Then i = i + 1.]

WO 00/25204    PCT/CA99/00919

Figure 3.

Figure 4.